feat(apiserver): implement APIService caBundle reconciliation #1808

Merged
praveenrewar merged 5 commits into carvel-dev:develop from himsngh:add-certs-reconcilation
Apr 6, 2026

@himsngh
Contributor

@himsngh himsngh commented Apr 2, 2026

What this PR does / why we need it:

Currently, kapp-controller executes a one-time patch of the v1alpha1.data.packaging.carvel.dev APIService object during pod initialization to inject its self-signed CA bundle.

During a rolling upgrade or node drain, a race condition occurs if an old pod restarts or terminates while the new pod is booting. The old pod can inadvertently overwrite the APIService with its own stale caBundle after the new pod has completed its one-time sync. Because nothing re-syncs the bundle afterwards, internal cluster routing to the packaging API stays broken until the new pod is manually restarted.

This PR resolves the issue by moving from a one-time boot sync to a continuous, idempotent background reconciliation loop, ensuring the active pod continuously acts as the source of truth for the APIService configuration.
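
To make the ordering problem concrete, here is a minimal, deterministic sketch of the race and of how a repeated reconcile pass repairs it. It is an illustration only: the `apiService` struct and the `syncOnce` and `reconcile` helpers are hypothetical stand-ins, not code from this PR, and the real object lives in the cluster rather than in memory.

```go
package main

import "fmt"

// apiService is a hypothetical stand-in for the cluster's APIService object;
// only the caBundle field matters for this illustration.
type apiService struct {
	caBundle string
}

// syncOnce models the one-time boot patch: the pod writes its own bundle
// unconditionally and never looks at the object again.
func syncOnce(svc *apiService, podBundle string) {
	svc.caBundle = podBundle
}

// reconcile models one pass of a continuous loop: the active pod rewrites the
// bundle only if it differs from its own. It reports whether a write happened.
func reconcile(svc *apiService, podBundle string) bool {
	if svc.caBundle == podBundle {
		return false
	}
	svc.caBundle = podBundle
	return true
}

func main() {
	svc := &apiService{}

	syncOnce(svc, "new-pod-ca") // new pod finishes its boot-time sync
	syncOnce(svc, "old-pod-ca") // old pod restarts and overwrites it afterwards
	fmt.Println("after race:", svc.caBundle)

	// With only a one-time sync, nothing corrects the value from here on.
	// A periodic reconcile pass by the active (new) pod repairs it.
	fmt.Println("repaired:", reconcile(svc, "new-pod-ca"), svc.caBundle)
	fmt.Println("second pass writes:", reconcile(svc, "new-pod-ca"))
}
```

The second `reconcile` call returns false, which is the idempotency property the loop relies on: steady state costs a read, not a write.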

Which issue(s) this PR fixes:

Fixes #1807

Does this PR introduce a user-facing change?

Fixed a race condition during rolling upgrades that could cause the kapp-controller packaging API routing to permanently break due to APIService CA bundle drift.

Additional Notes for your reviewer:

  • Registered apiservice-ca-reconciler as a PostStartHook on the generic API server. This ensures the initial sync completes before the server reports as ready.
  • Implemented a lightweight wait.Until background goroutine that polls the APIService every 30 seconds.
  • updateAPIService now performs a bytes.Equal() check against the in-memory bundle. It only executes an HTTP Update when a configuration drift is detected.
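
For reviewers who want the drift check in isolation, the following is a rough sketch of the compare-then-update step described in the last bullet. It does not reproduce updateAPIService: the `caBundleClient` interface, its method names, `fakeClient`, and `syncCABundle` are all assumptions made up for this example, and the real code talks to the Kubernetes API instead of an in-memory fake.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
)

// caBundleClient is a hypothetical, narrowed interface standing in for the
// real APIService client; the method names are assumptions for this sketch.
type caBundleClient interface {
	GetCABundle(ctx context.Context) ([]byte, error)
	UpdateCABundle(ctx context.Context, bundle []byte) error
}

// fakeClient is an in-memory implementation that counts writes.
type fakeClient struct {
	bundle  []byte
	updates int
}

func (f *fakeClient) GetCABundle(ctx context.Context) ([]byte, error) {
	return f.bundle, nil
}

func (f *fakeClient) UpdateCABundle(ctx context.Context, bundle []byte) error {
	f.bundle = append([]byte(nil), bundle...) // store a copy
	f.updates++
	return nil
}

// syncCABundle writes want only when the stored bundle differs from it.
// It reports whether an update was issued.
func syncCABundle(ctx context.Context, c caBundleClient, want []byte) (bool, error) {
	got, err := c.GetCABundle(ctx)
	if err != nil {
		return false, err
	}
	if bytes.Equal(got, want) {
		return false, nil // no drift, skip the write
	}
	if err := c.UpdateCABundle(ctx, want); err != nil {
		return false, err
	}
	return true, nil
}

func main() {
	ctx := context.Background()
	c := &fakeClient{bundle: []byte("stale")}

	changed, _ := syncCABundle(ctx, c, []byte("current"))
	fmt.Println(changed, c.updates) // prints "true 1"

	changed, _ = syncCABundle(ctx, c, []byte("current"))
	fmt.Println(changed, c.updates) // prints "false 1"
}
```

Counting writes in the fake makes the no-op case easy to assert on, which is roughly what a unit test for the idempotent path needs.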
Review Checklist:
  • Follows the developer guidelines
  • Relevant tests are added or updated
  • Relevant docs in this repo added or updated
  • Relevant carvel.dev docs added or updated in a separate PR and there's
    a link to that PR
  • Code is at least as readable and maintainable as it was before this
    change

Additional documentation e.g., Proposal, usage docs, etc.:

NONE


Testing

  • Built and loaded the updated image into Minikube and performed a rolling upgrade.
  • Verified the initial valid certificate:
❯ kubectl get apiservice v1alpha1.data.packaging.carvel.dev -o jsonpath='{.spec.caBundle}'
LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURkekNDQWwrZ0F3SUJBZ0lJTUlUaytMREVQR2t3RFFZSktvWklodmNOQVFFTEJRQXdLREVtTUNRR0ExVUUKQXd3ZGEyRndjQzFqYjI1MGNtOXNiR1Z5TFdOaFFERTNOelV4TVRReE56UXdIaGNOTWpZd05EQXlNRFl4TmpFMApXaGNOTWpjd05EQXlNRFl4TmpFMFdqQWxNU013SVFZRFZRUUREQnByWVhCd0xXTnZiblJ5YjJ4c1pYSkFNVGMzCk5URXhOREUzTkRDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTHY4dldoMmxvdTgKVSs3MmlybE5IbUttVzhlR2paK0M3emoxWWx0cFlQM0NRU1h0cWlNMnVjdTFnY0grK2pYOWlsOElxK1Z2Q1Z0UAplSVl5dG1FNFVRbGNoR3o3UU1WSHRvVE9WNzB3OG1SdDFPaWJUallPbzQrSU56ZE9CSkZJaWZGa09Xb1F3Zmw0Cm01M2lqbFI5bFE4QkJhOUJFL1ZYT2N5SlVPSVVaTFpVUDlCOWMrUlBuamZaTnV4VzlFR2Q0Q045Y2ZERE9SUngKbFFUbFl3ajF1N1ZGRlh3aE9aaVcxSzVmdWFwNEtuMGhQbXNNSVM5RGl3ZmFwMnpWUzFpbEJRb2RNZFJmUEhqeApWekk0eUVQZnpNM0NFNjM1N25keUVtVFFiSTNLSHhDWkF4RXRDWHFhTmQ0MVNkTjVpaDNkb2YyTW9ma0dVWmVvCjJvK2VPOFhSZm0wQ0F3RUFBYU9CcHpDQnBEQU9CZ05WSFE4QkFmOEVCQU1DQmFBd0V3WURWUjBsQkF3d0NnWUkKS3dZQkJRVUhBd0V3REFZRFZSMFRBUUgvQkFJd0FEQWZCZ05WSFNNRUdEQVdnQlRtd1FjbXlaOGF2RzFoRmduZwpHa3dRd2hhdHJqQk9CZ05WSFJFRVJ6QkZnZzlyWVhCd0xXTnZiblJ5YjJ4c1pYS0NJWEJoWTJ0aFoybHVaeTFoCmNHa3VhMkZ3Y0MxamIyNTBjbTlzYkdWeUxuTjJZNElKYkc5allXeG9iM04waHdSL0FBQUJNQTBHQ1NxR1NJYjMKRFFFQkN3VUFBNElCQVFBTmRnZWVsOVFFWEVFNHIwdEJvbTZWbHV6b3d3emMrYUxnS1Bza0JnWlhId3FlOWpJcwpWV1U3ampYYmpUT3RGck1FSmZlOFIvOTF6dks0eVVIenlvUjBVeWZjOVlFbzBQTEViSnBPbDJJdC82dFVUbStOClN2TVBnNGZHV1AxUGYvbVZiT3lzNm93L2lHYzN6WkNpVU0yYXFCR1BQbXBOTi9Sem9UNnUrbDdJdzVXTGtnK2UKVkp3RVN3UHBUT1IrWVNCYjdselNFdTF1UkNtUDVBY2RlanBTb0JxREhpek44NndEcGpZUVJjOVRoK21WaWFyagpaWHd5aVZGbVpSQ0NlVVlhcytrRmhXUnZ6TUhuN1REUmdWbGlSeElhV2ZKZGNXYU9HUk1zYzU4VC81aWlTREUwClVwT0FXVks2cC9sbkJmR2hRaS9GRVp4c1VuM2lkRDlpRHN3ZQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCi0tLS0tQkVHSU4gQ0VSVElGSUNBVEUtLS0tLQpNSUlERkRDQ0FmeWdBd0lCQWdJSVova3VlUUw4Nlh3d0RRWUpLb1pJaHZjTkFRRUxCUUF3S0RFbU1DUUdBMVVFCkF3d2RhMkZ3Y0MxamIyNTBjbTlzYkdWeUxXTmhRREUzTnpVeE1UUXhOelF3SGhjTk1qWXdOREF5TURZeE5qRTAKV2hjTk1qY3dOREF5TURZeE5qRTBXakFvTVNZd0pBWURWUVFEREIxcllYQndMV052Ym5SeWIyeHNaWEl0WTJGQQpNVGMzTlRFeE5ERTNOREND
QVNJd0RRWUpLb1pJaHZjTkFRRUJCUUFEZ2dFUEFEQ0NBUW9DZ2dFQkFLWStJTGNTCmJyYW9zOUhhSkdXci81WlNxbW5rWUVYbmJsbVc4TlRmWXAvT2dYWDdTVzducTJvYmwzdTZ1NGN5QVJ0a21EN3MKZk5NL2NtZ2NrMDVTMVVBdkRQOG9MMjdEYnNpeVczRWxqelJVcVlxVDZpdE5vSnBJYmNrb0NyTVljY1dFeGJVOQpQeXp4aVhldlBGYU1XbE8rdmFaUFM1bkN6WkdWbTJMU0R1a2VMZUtSUjY3UWpQeG0yK205UjlxTUYrQ0h1NkFZCjRmRHptOHIxejNYVi9wRXlSVVlTRDVDUzgzNFY2ZnJ5RXF0TFh2Vkh0T0hwUi9DL3lpemVhZzg5T3o4VnVPR3oKZVBFMFRxZ1k2YWJPRkUrU3NFWHhuS3hpRHpTZ3lrUmxCaCtEdU96OGQrVDlueFl4WFUyVnNsMU9LSXVYS3NCdAp1S0x4aXlUcWExT01uUk1DQXdFQUFhTkNNRUF3RGdZRFZSMFBBUUgvQkFRREFnS2tNQThHQTFVZEV3RUIvd1FGCk1BTUJBZjh3SFFZRFZSME9CQllFRk9iQkJ5YkpueHE4YldFV0NlQWFUQkRDRnEydU1BMEdDU3FHU0liM0RRRUIKQ3dVQUE0SUJBUUI0MUpGaGNlVThVZm9vQ2oyYnNSbkJjN3krVWFRU25LUHFlcktjUVlFYkVLNGxwR3RjTGFUTQpUKzdLM0ZGem5aVWNrYzFsTFBjVFlEUGo4TWhGbHd5aTU4UXNwbFZidWg0RzU3ZXI0aG5ZYS9mYzJJNFVyN1NOCmpwalJnOEJ6N2FiV0dPZk1jaU1QTWdUdEhpMGFrR1UwYU9FazhJaENXZUJNYjR6SStVOS9PUFd3V2pBQThwYVYKR2EySlY1RmFxdGZPTzFMM25nYXNkOXdoTnliVndnMUZMM3h3VVRYRENWd1RhMnhKTlMyRG5EWlp2YjlLa3RNcApsRXZZQzVVVVA3RUJSMGNxOUpVaVdjWnVSc2ZNLytaNlhMT0p2OFNwdElFSmhqUHVsUHZqN3BMSnhacXNVWk1vCmoyNjZwZUREWk5MNmNwMkVvaXFpWnd4SktRU3pVT2tVCi0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0K%                                                                                                                                                                                                                                                    ~ ❯
  • Patched the APIService with a dummy string to simulate an old pod overwriting the configuration:
❯ kubectl patch apiservice v1alpha1.data.packaging.carvel.dev --type=merge -p '{"spec":{"caBundle":"ZHVtbXktZGF0YS1zaW11bGF0aW5nLW9sZC1wb2Q="}}'
apiservice.apiregistration.k8s.io/v1alpha1.data.packaging.carvel.dev patched
  • Verified the APIService was temporarily broken:
❯ kubectl get apiservice v1alpha1.data.packaging.carvel.dev -o jsonpath='{.spec.caBundle}'
ZHVtbXktZGF0YS1zaW11bGF0aW5nLW9sZC1wb2Q=
  • Confirmed the APIService was automatically restored to the valid "good" state:
❯ kubectl logs -n kapp-controller deployment/kapp-controller | grep "Syncing CA certificate with APIServices"
Defaulted container "kapp-controller" out of: kapp-controller, kapp-controller-sidecarexec
{"level":"info","ts":"2026-04-02T07:16:14Z","logger":"kc.controller.apiserver","msg":"Syncing CA certificate with APIServices"}
{"level":"info","ts":"2026-04-02T07:17:44Z","logger":"kc.controller.apiserver","msg":"Syncing CA certificate with APIServices"}
  • Validated that the caBundle was successfully reconciled.
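
As a possible complement to eyeballing the jsonpath output, a small helper can tell a real bundle from the dummy value by decoding the base64 string and counting PEM certificate blocks. This is a sketch, not part of the PR; `countCertificates` is a hypothetical name, and it only checks PEM framing rather than parsing or verifying the certificates.

```go
package main

import (
	"encoding/base64"
	"encoding/pem"
	"fmt"
)

// countCertificates decodes a base64 caBundle value (as printed by the
// kubectl jsonpath query) and counts the PEM blocks of type CERTIFICATE.
// It does not validate the certificates themselves.
func countCertificates(caBundleB64 string) (int, error) {
	raw, err := base64.StdEncoding.DecodeString(caBundleB64)
	if err != nil {
		return 0, fmt.Errorf("decode base64: %w", err)
	}
	count := 0
	for {
		var block *pem.Block
		block, raw = pem.Decode(raw)
		if block == nil {
			break // no further PEM blocks in the remaining input
		}
		if block.Type == "CERTIFICATE" {
			count++
		}
	}
	return count, nil
}

func main() {
	// The dummy value patched into the APIService during the test.
	dummy := "ZHVtbXktZGF0YS1zaW11bGF0aW5nLW9sZC1wb2Q="
	n, err := countCertificates(dummy)
	fmt.Println(n, err) // prints "0 <nil>"
}
```

The dummy string decodes to plain text with no PEM framing, so the count is zero, whereas the bundle captured in the first step contains two BEGIN CERTIFICATE blocks. When pasting a value from the terminal, note that a trailing `%` appears to be the shell's marker for output without a final newline and is not part of the base64 data.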

Copilot AI left a comment

Pull request overview

This PR implements continuous, idempotent APIService caBundle reconciliation to fix a race condition that occurs during rolling upgrades or node drains. Previously, kapp-controller performed a one-time sync at startup, which could be overwritten by a terminating old pod, permanently breaking the packaging API routing. The solution introduces a background reconciliation loop that continuously ensures the active pod's certificate is the source of truth.

Changes:

  • Registered apiservice-ca-reconciler as a PostStartHook that performs initial caBundle sync before the server reports ready, followed by a background goroutine that polls every 30 seconds
  • Modified updateAPIService to accept a context parameter and perform an idempotent bytes.Equal() check to avoid unnecessary API updates when the bundle hasn't changed
  • Updated newServerConfig return type to include the caContentProvider so it can be passed to the background reconciliation loop
  • Added comprehensive unit tests covering both update and no-op scenarios
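
The hook behaviour summarised in the first bullet (a blocking initial sync, then background polling) might look roughly like the sketch below. It uses only the standard library: a `time.Ticker` stands in for the wait.Until helper named in the PR, and `startReconciler` and `runDemo` are hypothetical names introduced for this example. The actual PostStartHook registration is not shown.

```go
package main

import (
	"context"
	"fmt"
	"sync/atomic"
	"time"
)

// startReconciler runs syncFn once synchronously, so a caller can treat a nil
// return as "initial sync done", then keeps calling it every interval in a
// background goroutine until ctx is cancelled.
func startReconciler(ctx context.Context, interval time.Duration, syncFn func(context.Context) error) error {
	if err := syncFn(ctx); err != nil {
		return fmt.Errorf("initial sync: %w", err)
	}
	go func() {
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done():
				return
			case <-ticker.C:
				// Errors are dropped here for brevity; a real loop
				// would log them and retry on the next tick.
				_ = syncFn(ctx)
			}
		}
	}()
	return nil
}

// runDemo starts the reconciler with a short interval and returns how many
// times syncFn had run right after startup and again after a short wait.
func runDemo() (afterStart, later int64) {
	var calls atomic.Int64
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	err := startReconciler(ctx, 5*time.Millisecond, func(context.Context) error {
		calls.Add(1)
		return nil
	})
	if err != nil {
		return 0, 0
	}
	afterStart = calls.Load()
	time.Sleep(100 * time.Millisecond)
	return afterStart, calls.Load()
}

func main() {
	afterStart, later := runDemo()
	fmt.Println("calls after startup:", afterStart)
	fmt.Println("background loop ran again:", later > afterStart)
}
```

Returning the initial sync error to the caller, rather than swallowing it in the goroutine, is one way to let startup fail fast; whether the PR does the same is not stated in this summary.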

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File | Description
pkg/apiserver/apiserver.go | Added PostStartHook for APIService CA reconciliation with initial sync and background polling; modified function signatures and return types to support passing caContentProvider
pkg/apiserver/apiserver_test.go | New test file with unit tests for updateAPIService function, covering drift detection and idempotent update scenarios

Comment thread pkg/apiserver/apiserver_test.go
Comment thread pkg/apiserver/apiserver.go
Comment thread pkg/apiserver/apiserver_test.go
Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.


Comment thread pkg/apiserver/apiserver_test.go
@himsngh himsngh force-pushed the add-certs-reconcilation branch 5 times, most recently from ecd6043 to c058e14 Compare April 3, 2026 11:11
himsngh added 4 commits April 6, 2026 12:31
Signed-off-by: Himanshu Singh <himansh.singh3@gmail.com>
Signed-off-by: Himanshu Singh <himansh.singh3@gmail.com>
Signed-off-by: Himanshu Singh <himansh.singh3@gmail.com>
Signed-off-by: Himanshu Singh <himansh.singh3@gmail.com>
@himsngh himsngh force-pushed the add-certs-reconcilation branch from 5fc7c7b to dd75238 Compare April 6, 2026 07:02
Signed-off-by: Himanshu Singh <himansh.singh3@gmail.com>
@himsngh himsngh force-pushed the add-certs-reconcilation branch from dd75238 to baecdc3 Compare April 6, 2026 08:15
@praveenrewar praveenrewar merged commit 32fc399 into carvel-dev:develop Apr 6, 2026
12 checks passed
@github-project-automation github-project-automation Bot moved this to Closed in Carvel Apr 6, 2026
Labels

None yet

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

APIService caBundle routing breaks during kapp-controller rolling upgrades

4 participants